Sentence-level MT evaluation
Authors
Abstract
In this paper we investigate the possibility of evaluating MT quality and fluency at the sentence level in the absence of reference translations. We measure the correlation between automatically generated scores and human judgments, and we evaluate the performance of our system when used as a classifier for identifying highly dysfluent and ill-formed sentences. We show that we can substantially improve on the correlation between language model perplexity scores and human judgment by combining these perplexity scores with class probabilities from a machine-learned classifier. The classifier uses linguistic features and has been trained to distinguish human translations from machine translations. We show that this approach also performs well in identifying dysfluent sentences.
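As an illustration of the combination described above, here is a minimal sketch, assuming a target-language model exposing a per-sentence perplexity() method, a featurize() function producing linguistic feature vectors, and a scikit-learn classifier; all of these names are hypothetical stand-ins, not the authors' implementation.

```python
# Hedged sketch: blend LM perplexity with a classifier's class
# probability into a single sentence-level fluency score.
# `lm`, `clf`, and `featurize` are assumed components.
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_human_vs_machine(feature_vectors, labels):
    """Train a classifier to separate human (label 1) from machine
    (label 0) translations, given linguistic feature vectors."""
    return LogisticRegression(max_iter=1000).fit(feature_vectors, labels)

def fluency_score(sentence, lm, clf, featurize, alpha=0.5):
    """Combine normalized perplexity with P(human | features)."""
    ppl = lm.perplexity(sentence)          # lower perplexity = more fluent
    lm_score = 1.0 / (1.0 + np.log(ppl))   # squash into (0, 1]
    p_human = clf.predict_proba([featurize(sentence)])[0, 1]
    return alpha * lm_score + (1.0 - alpha) * p_human
```

The linear interpolation with alpha is one simple way to combine the two signals; the two scores could equally well be fed jointly into a second-stage model.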
Similar resources
The Back-translation Score: Automatic MT Evaluation at the Sentence Level without Reference Translations
Automatic tools for machine translation (MT) evaluation such as BLEU are well established, but have the drawbacks that they do not perform well at the sentence level and that they presuppose manually translated reference texts. Assuming that the MT system to be evaluated can deal with both directions of a language pair, in this research we suggest to conduct automatic MT evaluation by determini...
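A minimal sketch of the round-trip idea, assuming a hypothetical translate(text, src, tgt) wrapper around the MT system under evaluation, with NLTK's sentence-level BLEU as the similarity measure:

```python
# Hedged sketch of a back-translation score: translate the MT output
# back into the source language and compare it with the original
# source sentence. translate() is an assumed wrapper, not a real API.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def back_translation_score(source, mt_output, translate, src="en", tgt="de"):
    """Round-trip mt_output (in tgt) back into src and BLEU-score it
    against the original source sentence (language codes illustrative)."""
    back = translate(mt_output, src=tgt, tgt=src)   # tgt -> src direction
    smooth = SmoothingFunction().method1            # avoid zero n-gram counts
    return sentence_bleu([source.split()], back.split(),
                         smoothing_function=smooth)
```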
Using Machine Translation Evaluation Techniques to Determine Sentence-level Semantic Equivalence
The task of machine translation (MT) evaluation is closely related to the task of sentence-level semantic equivalence classification. This paper investigates the utility of applying standard MT evaluation methods (BLEU, NIST, WER and PER) to building classifiers to predict semantic equivalence and entailment. We also introduce a novel classification method based on PER which leverages part of s...
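For intuition, a sketch of PER as a pairwise feature, using the common bag-of-words formulation of the position-independent error rate; the symmetric feature vector and the SVM are assumptions, not the paper's exact pipeline:

```python
# Hedged sketch: MT metrics as features for semantic-equivalence
# classification between two sentences.
from collections import Counter
from sklearn.svm import SVC

def per(ref_tokens, hyp_tokens):
    """Position-independent error rate: word errors ignoring order."""
    matches = sum((Counter(ref_tokens) & Counter(hyp_tokens)).values())
    return (max(len(ref_tokens), len(hyp_tokens)) - matches) / len(ref_tokens)

def pair_features(sent_a, sent_b):
    a, b = sent_a.split(), sent_b.split()
    return [per(a, b), per(b, a)]   # both directions, since PER is asymmetric

# Assumed usage: pairs of sentences with equivalence labels.
# clf = SVC(probability=True).fit(
#     [pair_features(a, b) for a, b in pairs], labels)
```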
Regression for Sentence-Level MT Evaluation with Pseudo References
Many automatic evaluation metrics for machine translation (MT) rely on making comparisons to human translations, a resource that may not always be available. We present a method for developing sentence-level MT evaluation metrics that do not directly rely on human reference translations. Our metrics are developed using regression learning and are based on a set of weaker indicators of fluency a...
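A sketch of the pseudo-reference idea under stated assumptions: other systems' outputs stand in for human references, their per-sentence metric scores become weak indicator features, and a regression model maps those features to a quality score (the indicator set and model choice here are illustrative):

```python
# Hedged sketch: regression-based sentence scoring with pseudo
# references (other systems' outputs in place of human translations).
from sklearn.svm import SVR

def pseudo_ref_features(hypothesis, pseudo_refs, metric):
    """One metric score per pseudo reference, treated as weak
    fluency/adequacy indicators rather than gold-standard comparisons."""
    return [metric(ref, hypothesis) for ref in pseudo_refs]

# Assumed training setup: X = feature vectors for sentences that have
# human quality judgments y.
# model = SVR(kernel="rbf").fit(X, y)
# model.predict([pseudo_ref_features(hyp, refs, metric)]) -> score
```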
Normalized Compression Distance as automatic MT evaluation metric
This paper evaluates a new automatic MT evaluation metric, Normalized Compression Distance (NCD), which is a general tool for measuring similarities between binary strings. We provide system-level correlations and sentence-level consistencies to human judgements and comparison to other automatic measures with the WMT’08 dataset. The results show that the general NCD metric is at the same level ...
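NCD itself is straightforward to compute; here is a minimal sketch with zlib standing in for the compressor C, following the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)):

```python
# Normalized Compression Distance with zlib as the compressor C(.);
# lower values mean more similar strings.
import zlib

def c(s: str) -> int:
    """Compressed length of s in bytes."""
    return len(zlib.compress(s.encode("utf-8")))

def ncd(x: str, y: str) -> float:
    cx, cy = c(x), c(y)
    return (c(x + y) - min(cx, cy)) / max(cx, cy)

# Illustrative use: compare an MT hypothesis with a reference string.
# ncd("the cat sat on the mat", "a cat sat on a mat")
```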
Towards a Predicate-Argument Evaluation for MT
HMEANT (Lo and Wu, 2011a) is a manual MT evaluation technique that focuses on the predicate-argument structure of the sentence. We relate HMEANT to an established linguistic theory, highlighting the possibilities of reusing existing knowledge and resources for interpreting and automating HMEANT. We apply HMEANT to a new language, Czech in particular, by evaluating a set of English-to-Czech MT system...
Journal:
Volume/Issue:
Pages: -
Publication date: 2005